# Efficient Speech Recognition

Parakeet Tdt 0.6b V2 Mlx
An automatic speech recognition model converted to MLX format for fast inference.
Speech Recognition, Safetensors, English
senstella
183
6
Faster Distil Whisper Large V3.5
MIT
Distil-Whisper is a distilled version of the Whisper model, optimized for Automatic Speech Recognition (ASR) tasks, offering faster inference speeds.
Speech Recognition, English
Purfview
565
2
Faster Distil Whisper Large V3.5
MIT
A CTranslate2-format model converted from Distil-Whisper large-v3.5 for efficient speech recognition.
Speech Recognition, English
deepdml
58.15k
2
Distil Large V3.5 ONNX
MIT
Distil-Whisper is a knowledge-distilled version of OpenAI Whisper-Large-v3, provided here in ONNX format for efficient inference.
Speech Recognition, Transformers, English
distil-whisper
25
1
Distil Large V3.5 Ct2
MIT
Distil-Whisper is a distilled version of the Whisper model that achieves efficient speech recognition through training on large-scale pseudo-labeled data.
Speech Recognition, English
distil-whisper
264
3
Distil Large V3.5
MIT
Distil-Whisper is a knowledge-distilled version of OpenAI Whisper-Large-v3, achieving efficient speech recognition through large-scale pseudo-label training.
Speech Recognition, Transformers, English
distil-whisper
4,804
25
Faster Whisper V2 D4
Apache-2.0
This is an optimized Hebrew and English speech recognition model based on the Whisper model, developed by ivrit.ai.
Speech Recognition, Supports Multiple Languages
ivrit-ai
696
16
Distil Large V3
MIT
Distil-Whisper is a knowledge-distilled version of Whisper large-v3, focusing on English automatic speech recognition, offering faster inference speeds while maintaining accuracy close to the original model.
Speech Recognition, English
distil-whisper
417.11k
311
Parakeet Tdt 1.1b
Parakeet TDT 1.1B is an automatic speech recognition (ASR) model jointly developed by NVIDIA NeMo and Suno.ai, capable of transcribing speech into lowercase English letters.
Speech Recognition, English
nvidia
12.27k
90
Faster Distil Whisper Medium.en
MIT
This is a version of the distil-whisper/distil-medium.en model converted to CTranslate2 format for efficient speech recognition tasks.
Speech Recognition, English
Systran
6,155
4
Faster Distil Whisper Large V2
MIT
A distilled automatic speech recognition (ASR) model based on the Whisper architecture, designed for efficient inference on English speech-to-text tasks.
Speech Recognition, English
Systran
1,336
19
Sew D Mid 400k Ft Ls100h
Apache-2.0
SEW-D-mid is a pre-trained speech model developed by ASAPP Research for automatic speech recognition, designed to balance performance and efficiency.
Speech Recognition, Transformers, English
asapp
20
1
Sew D Tiny 100k Ft Ls100h
Apache-2.0
SEW-D-tiny is an efficient pre-trained speech recognition model developed by ASAPP Research, designed to balance performance and efficiency.
Speech Recognition, Transformers, English
asapp
24.55k
2
Sew Tiny 100k
Apache-2.0
SEW-tiny is a compact, efficient pre-trained speech model developed by ASAPP Research, pre-trained on speech audio sampled at 16 kHz and suitable for various downstream speech tasks.
Speech Recognition, Transformers, Supports Multiple Languages
asapp
1,080
3
Sew Tiny 100k Ft Ls100h
Apache-2.0
SEW (Squeezed and Efficient Wav2vec) is a pre-trained speech recognition model developed by ASAPP Research, reported to improve on wav2vec 2.0 in both recognition performance and efficiency.
Speech Recognition, Transformers, Supports Multiple Languages
asapp
736
1
Sew D Base Plus 400k Ft Ls100h
Apache-2.0
SEW-D-base+ is an efficient speech recognition model developed by ASAPP Research, pre-trained on speech audio sampled at 16 kHz and reported to perform well on the LibriSpeech dataset.
Speech Recognition, Transformers, English
asapp
66
4
Sew D Mid K127 400k Ft Ls100h
Apache-2.0
SEW-D-mid-k127 is an efficient pre-trained speech recognition model developed by ASAPP Research, reported to improve significantly on wav2vec 2.0 in both performance and efficiency.
Speech Recognition, Transformers, English
asapp
16
0